Continuous Benchmarking #936


Draft: wants to merge 14 commits into master

Conversation

@Malmahrouqi3 Malmahrouqi3 (Collaborator) commented Jul 11, 2025

User description

Description

Concerning #462: this PR tracks benchmark results (./mfc.sh bench) for performance-critical improvements. Since there is no dedicated benchmark procedure yet, the results of the four existing MFC benchmark cases are reported. To keep performance measurements standardized and free of hardware bias, all benchmarking runs on a GitHub runner until we decide which resources/clusters/allocations/runners to use. Once the proof of concept is finalized, the rest should be straightforward.

Debugging info:
Not much besides reviewing the .md pages.

To-dos:

  • Properly dump results into .json file(s) per the Google Benchmark framework requirements (see the format sketch after this list).
  • A GitHub Page with live charts reporting performance details.
  • If the framework proves too glitchy/complicated, continuously appending datapoints to .json file(s) and plotting from those should not be hard to implement.
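
For reference, a minimal sketch of the JSON layout the Google Benchmark format uses (and which benchmark-action/github-action-benchmark consumes as the "googlecpp" tool); all values here are illustrative:

    {
      "context": {
        "date": "2025-07-11T20:35:00+00:00",
        "host_name": "github-runner",
        "executable": "mfc"
      },
      "benchmarks": [
        {
          "name": "5eq_rk3_weno3_hllc/simulation_time",
          "run_type": "iteration",
          "iterations": 1,
          "real_time": 1.23e9,
          "cpu_time": 1.23e9,
          "time_unit": "ns"
        }
      ]
    }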

Note to self:
Look into retrospectively recording the previous 10-50 base-repo commits to backfill invaluable datapoints (see the sketch below).
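
A minimal sketch of such a backfill, assuming ./mfc.sh bench leaves a bench.yaml in the repository root (the output paths and commit count are illustrative):

    # Re-run the benchmark suite for the last 20 commits on master
    # and stash each result under its commit SHA.
    mkdir -p results
    for sha in $(git rev-list --max-count=20 master); do
        git checkout --quiet "$sha"
        ./mfc.sh bench
        cp bench.yaml "results/${sha}.yaml"
    done
    git checkout --quiet master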


PR Type

Enhancement


Description

  • Implement continuous benchmarking with GitHub Action workflow

  • Remove legacy cluster-specific benchmark scripts

  • Add Google Benchmark format conversion for MFC results

  • Create automated performance tracking and documentation


Changes diagram

flowchart LR
  A["Legacy Benchmark Scripts"] -- "removed" --> B["Continuous Benchmarking"]
  B --> C["GitHub Action Workflow"]
  C --> D["MFC Benchmark Execution"]
  D --> E["Google Benchmark Format"]
  E --> F["Performance Tracking"]
  F --> G["Documentation Generation"]

Changes walkthrough 📝

Relevant files

Enhancement (21 files)
  • cont-bench.yml: Add continuous benchmarking GitHub Action workflow (+149/-0)
  • bench.yml: Remove legacy benchmark workflow (+0/-110)
  • bench.sh: Remove Frontier cluster benchmark script (+0/-16)
  • build.sh: Remove Frontier cluster build script (+0/-17)
  • submit-bench.sh: Remove Frontier cluster benchmark submission script (+0/-54)
  • submit.sh: Remove Frontier cluster submission script (+0/-55)
  • test.sh: Remove Frontier cluster test script (+0/-10)
  • bench.sh: Remove Phoenix cluster benchmark script (+0/-27)
  • submit-bench.sh: Remove Phoenix cluster benchmark submission script (+0/-64)
  • submit.sh: Remove Phoenix cluster submission script (+0/-64)
  • test.sh: Remove Phoenix cluster test script (+0/-21)
  • cleanliness.yml: Remove code cleanliness workflow (+0/-127)
  • coverage.yml: Remove coverage check workflow (+0/-48)
  • docs.yml: Remove documentation workflow (+0/-76)
  • formatting.yml: Remove formatting workflow (+0/-19)
  • line-count.yml: Remove line count workflow (+0/-54)
  • lint-source.yml: Remove source linting workflow (+0/-55)
  • lint-toolchain.yml: Remove toolchain linting workflow (+0/-17)
  • pmd.yml: Remove PMD source analysis workflow (+0/-131)
  • spelling.yml: Remove spell check workflow (+0/-17)
  • test.yml: Remove test suite workflow (+0/-131)

Tests (5 files)
  • test-components.sh: Add component testing script for benchmarks (+48/-0)
  • bench: Add benchmark results YAML file (+153/-0)
  • bench-google.json: Add Google Benchmark format JSON results (+69/-0)
  • bench.json: Add benchmark results JSON file (+195/-0)
  • bench.yaml: Add benchmark results YAML file (+153/-0)

Configuration changes (1 file)
  • .env: Add environment configuration with GitHub tokens (+2/-0)

Documentation (1 file)
  • cont-bench.md: Add continuous benchmarking documentation (+154/-0)

@Copilot Copilot AI review requested due to automatic review settings July 11, 2025 20:35
@Malmahrouqi3 Malmahrouqi3 requested review from a team and sbryngelson as code owners July 11, 2025 20:35

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    🎫 Ticket compliance analysis 🔶

    #462 - Partially compliant

    Compliant requirements:

    • Use benchmark-action/github-action-benchmark to track benchmark CI results over time
    • Implement continuous performance monitoring

    Non-compliant requirements:

    • Keep track of benchmark results that are currently only tracked in PR CI

    Requires further human verification:

    • Verify that the GitHub Pages deployment works correctly for displaying benchmark charts
    • Confirm that the benchmark data format conversion produces accurate results
    • Test that the alert system functions properly when performance degrades

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 No relevant tests
    🔒 Security concerns

    Sensitive information exposure:
    The .env file contains hardcoded GitHub personal access tokens in plain text. These tokens provide repository access and should never be committed to version control. The tokens should be stored as GitHub repository secrets and accessed via ${{ secrets.GITHUB_TOKEN }} in the workflow instead.
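
    A minimal sketch of that pattern, using the benchmark-action/github-action-benchmark step the linked issue asks for (the step name and file path are illustrative):

        # Sketch: the token comes from repository secrets, never from a committed file.
        - name: Publish benchmark results
          uses: benchmark-action/github-action-benchmark@v1
          with:
            tool: googlecpp
            output-file-path: bench-google.json
            github-token: ${{ secrets.GITHUB_TOKEN }}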

    ⚡ Recommended focus areas for review

    Security Risk

    GitHub personal access token is hardcoded in plain text in the .env file, which poses a serious security vulnerability if committed to the repository.

    TOKEN=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
    GITHUB_TOKEN=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
    Possible Issue

    The workflow references 'bench.json' file in the Python conversion script but the actual benchmark output is 'bench.yaml'. This mismatch will cause the conversion step to fail.

    with open('bench.json', 'r') as f:
        mfc_data = json.load(f)
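
    Since the workflow's setup step already installs mikefarah's yq, one way to bridge this mismatch is to convert the YAML output to JSON before the Python step runs (a sketch, assuming bench.yaml is the file MFC actually writes):

        # Convert MFC's YAML benchmark output into the JSON the script expects.
        yq -o=json '.' bench.yaml > bench.json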
    Logic Error

    The workflow runs on all push/PR events but only processes simulation_time metrics, ignoring grind_time data that exists in the MFC output, leading to incomplete performance tracking.

    if 'simulation' in output_summary and 'exec' in output_summary['simulation']:
        benchmarks.append({
            "name": f"{case_name}/simulation_time",
            "family_index": len(benchmarks),
            "per_family_instance_index": 0,
            "run_name": f"{case_name}/simulation_time",
            "run_type": "iteration",
            "repetitions": 1,
            "repetition_index": 0,
            "threads": 1,
            "iterations": 1,
            "real_time": output_summary['simulation']['exec'] * 1e9,
            "cpu_time": output_summary['simulation']['exec'] * 1e9,
            "time_unit": "ns"
        })
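
    If grind-time tracking is wanted, a parallel entry could be appended in the same pass; a sketch, where the 'grind' key and its unit are assumptions about the MFC output rather than a confirmed schema:

        # Sketch: also record grind time, assuming the summary exposes it under
        # a 'grind' key (hypothetical name) analogous to 'exec'.
        if 'simulation' in output_summary and 'grind' in output_summary['simulation']:
            benchmarks.append({
                "name": f"{case_name}/grind_time",
                "run_name": f"{case_name}/grind_time",
                "run_type": "iteration",
                "iterations": 1,
                "real_time": output_summary['simulation']['grind'],
                "cpu_time": output_summary['simulation']['grind'],
                "time_unit": "ns"
            })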


    qodo-merge-pro bot commented Jul 11, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Security
    Remove hardcoded GitHub tokens

    The .env file contains hardcoded GitHub personal access tokens which pose a
    critical security risk. These tokens should never be committed to version
    control as they provide access to GitHub repositories and can be exploited by
    malicious actors.

    .env [1-2]

    -TOKEN=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
    -GITHUB_TOKEN=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
    +# Example environment variables - replace with your actual values
    +# TOKEN=your_github_token_here
    +# GITHUB_TOKEN=your_github_token_here
    Suggestion importance[1-10]: 10


    Why: This is a critical security vulnerability, as committing hardcoded personal access tokens to version control can lead to unauthorized access to GitHub resources.

    Impact: High
    Possible issue
    Install GitHub CLI dependency

    The gh auth token command will fail because the GitHub CLI is not installed in
    the setup step. This will cause the workflow to fail when trying to export the
    TOKEN variable.

    .github/workflows/cont-bench.yml [41-49]

     - name: Setup
       run: | 
         sudo apt update -y
         sudo apt install -y cmake gcc g++ python3 python3-dev hdf5-tools \
    -              libfftw3-dev libhdf5-dev openmpi-bin libopenmpi-dev
    +              libfftw3-dev libhdf5-dev openmpi-bin libopenmpi-dev gh
         export TOKEN=$(gh auth token)
         sudo wget -qO /usr/local/bin/yq https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64
         sudo chmod +x /usr/local/bin/yq
         yq --version
    Suggestion importance[1-10]: 9


    Why: The workflow will fail because it attempts to use the gh command without installing the GitHub CLI first, which is a blocking bug for this CI job.

    Impact: High
    Fix benchmark file path

    The Python script attempts to read bench.json from the current directory, but
    the file is actually located in the pr/ subdirectory based on the previous step.
    This will cause a FileNotFoundError.

    .github/workflows/cont-bench.yml [57-65]

     - name: Convert MFC to Google Benchmark Format
       run: |
         python3 << 'EOF'
         import json
         from datetime import datetime
     
         # Read the MFC benchmark data
    -    with open('bench.json', 'r') as f:
    +    with open('pr/bench.json', 'r') as f:
             mfc_data = json.load(f)
    Suggestion importance[1-10]: 9


    Why: The script will fail with a FileNotFoundError because it tries to open bench.json in the wrong directory, which is a blocking bug for this CI job.

    Impact: High

    @Copilot Copilot AI (Contributor) left a comment

    Pull Request Overview

    This PR introduces continuous benchmarking for performance-critical workflows, including a new test script, documentation updates, sample benchmark data, and a dedicated GitHub Actions workflow. It also removes existing CI workflows in favor of a single continuous benchmarking pipeline.

    • Added test-components.sh to validate component execution and JSON conversions.
    • Generated documentation docs/documentation/cont-bench.md with sample benchmark results.
    • Introduced bench.yaml/json, bench-google.json, and .github/workflows/cont-bench.yml to automate benchmark collection and Google Benchmark conversion.
    • Note: an .env file with sensitive tokens was added and multiple legacy workflows were removed.

    Reviewed Changes

    Copilot reviewed 28 out of 28 changed files in this pull request and generated 4 comments.

    Summary per file:
    • test-components.sh: Script for testing component execution and JSON/YAML conversions
    • docs/documentation/cont-bench.md: Documentation page for continuous benchmarking
    • bench.yaml / bench.json / bench-google.json: Sample benchmark data
    • .github/workflows/cont-bench.yml: New continuous benchmarking workflow
    • .env: Environment file with sensitive tokens
    • .github/workflows/*.yml (other CI workflows): Legacy workflows deleted
    Comments suppressed due to low confidence (4)

    .github/workflows/cont-bench.yml:64

    • The Python script reads 'bench.json' in the root, but the workflow converts and writes to 'pr/bench.json'. Update the path to match where the file is written.
              with open('bench.json', 'r') as f:
    

    bench.yaml:5

    • Avoid hard-coded absolute paths; use relative paths or environment variables to improve portability.
          path: /home/mohammed/Desktop/cont-bench/benchmarks/5eq_rk3_weno3_hllc/case.py
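
    For example, a portable form anchored at the repository root (sketch):

        # Relative to the repo checkout instead of a user's home directory.
        path: benchmarks/5eq_rk3_weno3_hllc/case.py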
    

    .env:1

    • The .env file commits personal access tokens, exposing secrets. Remove it from version control and use GitHub Secrets instead.
    TOKEN=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
    

    @Malmahrouqi3 (Collaborator, Author) commented:

    Posted my GitHub account tokens; I will just disable them right now.

    @sbryngelson (Member) commented:

    You can put back your workflow files. You will want to mimic the setup of the new bench.sh (roughly). I had to remove a couple of GitHub Actions, such as storing artifacts on Phoenix, because the new SOCKS5 proxy doesn't allow for them (so far as I've figured out). I'm worried that all of your new commands here will cause some issues, but we shall find out.

    @sbryngelson sbryngelson marked this pull request as draft July 14, 2025 07:05